Abstract


This document summarizes the major milestones in mobile Augmented Reality
between 1968 and 2014. Mobile Augmented Reality has evolved considerably over
the last decade, as has the interpretation of what Mobile Augmented Reality
is. The first instances of mobile AR can be associated with the development
of wearable AR, in the sense of experiencing AR during locomotion (mobile as
in motion). With the transformation and miniaturization of physical devices
and displays, the concept of mobile AR evolved towards the notion of the
"mobile device", that is, AR on a mobile device. In this history of mobile AR
we consider both definitions and the evolution of the term over time.

Major parts of the list were initially compiled by the members of the
Christian Doppler Laboratory for Handheld Augmented Reality in 2009 (author
list in alphabetical order) for the ISMAR society. More recent work was added
in 2013 and during the preparation of this report.

Permission is granted to copy and modify. Please email the first author if you 
find any errors. 
Introduction


This document summarizes the major milestones in mobile Augmented Reality
between 1968 and 2014. Mobile Augmented Reality has evolved considerably over
the last decade, as has the interpretation of what Mobile Augmented Reality
is. The first instances of mobile AR can be associated with the development
of wearable AR, in the sense of experiencing AR during locomotion (mobile as
in motion). With the transformation and miniaturization of physical devices
and displays, the concept of mobile AR evolved towards the notion of the
"mobile device", that is, AR on a mobile device. In this history of mobile AR
we consider both definitions and the evolution of the term over time.

Major parts of the list were initially compiled by the members of the
Christian Doppler Laboratory for Handheld Augmented Reality^1 in 2009 (author
list in alphabetical order) for the ISMAR society. More recent work was added
in 2013 and during the preparation of this report.

Permission is granted to copy and modify. Please email the first author 
if you find any errors. 


Figure 1: Icons used throughout this report for a rough categorization of
related research, development and events: (a) Research, (b) Mobile PC,
(c) Mobile Phone, (d) Hardware, (e) Standard, (f) Game, (g) Tool.


^1 CDL on Handheld AR: http://studierstube.org/handheld_ar/
Figure 2: (a): Sutherland's system in [67]. (b): Conceptual tablet computer
by Kay in 1972 [32]. (c): First handheld mobile phone by Motorola in 1973.
(d): Caudell and Mizell coining AR in 1992 [8]. (e): IBM smartphone
presented in 1992.


1968 


Ivan Sutherland [67] creates the first augmented reality system, which is
also the first virtual reality system (see Fig. 2(a) left). It uses an
optical see-through head-mounted display that is tracked by one of two
different 6DOF trackers: a mechanical tracker and an ultrasonic tracker. Due
to the limited processing power of computers at that time, only very simple
wireframe drawings could be displayed in real time.


1972 


The first conceptual tablet computer, named the Dynabook, was proposed in
1972 by Alan Kay [32]. The Dynabook was proposed as a personal computer for
children, having the form factor of a tablet with a mechanical keyboard (a
design very similar to that of the One Laptop per Child project started in
2005). The Dynabook is widely regarded as the precursor of tablet computers,
decades before the iPad (see Fig. 2(b)).

1973 


The first handheld mobile phone was presented by Motorola and demonstrated
in April 1973 by Dr Martin Cooper [1]. The phone, named DynaTAC (Dynamic
Adaptive Total Area Coverage), supported only 35 minutes of talk time (see
Fig. 2(c)).

1982 

The first laptop, the Grid Compass^2 1100, is released; it is also the first
computer to use a clamshell design. The Grid Compass 1100 had an Intel 8086
CPU, 350 Kbytes of memory and a display with a resolution of 320x240 pixels,
which was extremely powerful for that time and justified the enormous cost
of 10,000 USD. However, its weight of 5 kg made it hardly portable.

1992 



Tom Caudell and David Mizell coin the term "augmented reality" to refer to
overlaying computer-presented material on top of the real world [8] (see
Fig. 2(d)). Caudell and Mizell discuss the advantages of AR versus VR, such
as requiring less processing power since fewer pixels have to be rendered.
They also acknowledge the increased registration requirements needed to
align real and virtual content.

At COMDEX 1992, IBM and BellSouth introduce the first smartphone, the IBM
Simon Personal Communicator^3, which was released in 1993 (see Fig. 2(e)).
The phone has 1 Megabyte of memory and a B/W touch screen with a resolution
of 160 x 293 pixels. The IBM Simon works as a phone, pager, calculator,
address book, fax machine, and e-mail device. It weighs 500 grams and costs
900 USD.


^2 http://home.total.net/~hrothgar/museum/Compass/
^3 Wikipedia: http://en.wikipedia.org/wiki/Simon_(phone)
Figure 3: (a): Chameleon system proposed by Fitzmaurice [16]. (b):
NAVSTAR-GPS goes live in 1993. (c): Apple Newton Message Pad 100.

1993 



Loomis et al. develop a prototype of an outdoor navigation system for the
visually impaired [39]. They combine a notebook with a differential GPS
receiver and a head-worn electronic compass. The application uses data from
a GIS (Geographic Information System) database and provides navigational
assistance using an "acoustic virtual display": labels are spoken using a
speech synthesizer and played back at the correct locations within the
auditory space of the user.



Fitzmaurice creates Chameleon (see Fig. 3(a)), a key example of displaying
spatially situated information with a tracked hand-held device. In his setup
the output device consists of a 4" screen connected to a video camera via a
cable [16]. The video camera records the content of a Silicon Graphics
workstation's large display in order to display it on the small screen.
Fitzmaurice uses a tethered magnetic tracker (Ascension bird) for
registration in a small working environment. Several gestures plus a single
button allow the user to interact with the mobile device. Chameleon's
mobility was strongly limited due to the cabling. It also did not augment
the real world in the sense of overlaying objects on a video feed.


In December 1993 the Global Positioning System (GPS, official name
"NAVSTAR-GPS") achieves initial operational capability (see Fig. 3(b)).
Although GPS^4 was originally launched as a military service, millions of
people now use it for navigation and other tasks such as geocaching or
Augmented Reality. A GPS receiver calculates its position by carefully
timing the signals sent by the constellation of GPS satellites. The accuracy
of civilian GPS receivers is typically in the range of 15 meters.

^4 Wikipedia: http://en.wikipedia.org/wiki/Global_Positioning_System















Figure 4: (a): Milgram Continuum [41]. (b): Rekimoto's NaviCam system
[58]. (c): Rekimoto's matrix marker [57]. (d) and (e): Touring Machine by
Feiner et al. [15].


Higher accuracy can be achieved with Differential GPS (DGPS), which uses
correction signals from fixed, ground-based reference stations.
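
To illustrate how timed signals translate into a position, the following
minimal Python sketch converts signal travel times into ranges and intersects
three range circles in a plane. It is a simplified illustration, not the
algorithm of an actual GPS receiver: real receivers work in 3D with at least
four satellites and also solve for the receiver clock error. The function
names and the 2D transmitter layout are assumptions made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def ranges_from_travel_times(travel_times_s):
    """Convert one-way signal travel times (seconds) into distances (meters)."""
    return [C * t for t in travel_times_s]


def trilaterate_2d(anchors, ranges):
    """Intersect three range circles around known 2D transmitter positions.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("transmitter positions are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


if __name__ == "__main__":
    # Hypothetical transmitter positions (meters) and a receiver at (3, 4).
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    times = [math.dist(a, (3.0, 4.0)) / C for a in anchors]
    print(trilaterate_2d(anchors, ranges_from_travel_times(times)))
```

Because the ranges are obtained by multiplying travel times by the speed of
light, a timing error of only one microsecond already corresponds to roughly
300 meters, which is why precise timing and correction signals matter.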


The Apple Newton Message Pad 100 was one of the earliest commercial personal
digital assistants (PDAs)^5. It is equipped with a stylus and handwriting
recognition and features a black-and-white screen of 336x240 pixels (see
Fig. 3(c)).


1994 

Steve Mann starts wearing a webcam for almost 2 years. From 1994 to 1996
Mann wore a mobile camera plus display for almost every waking minute. Both
devices were connected to his website, allowing online visitors to see what
Steve was seeing and to send him messages that would show up on his mobile
display^6.

Paul Milgram and Fumio Kishino write their seminal paper "Taxonomy of Mixed
Reality Visual Displays", in which they define the Reality-Virtuality
Continuum [41] (see Fig. 4(a)). Milgram and Kishino describe a continuum that
spans from the real environment to the virtual environment. In between lie
Augmented Reality, which is closer to the real environment, and Augmented
Virtuality, which is closer to the virtual environment. Today Milgram's
Continuum and Azuma's definition (1997) are commonly accepted as defining
Augmented Reality.

^5 Wikipedia: http://en.wikipedia.org/wiki/MessagePad
^6 S. Mann, Wearable Wireless Webcam, personal WWW page, wearcam.org


1995 


Jun Rekimoto and Katashi Nagao create the NaviCam, a tethered setup similar
to Fitzmaurice's Chameleon [58] (see Fig. 4(b)). The NaviCam also uses a
nearby powerful workstation, but has a camera mounted on the mobile screen
that is used for optical tracking. The computer detects color-coded markers
in the live camera image and displays context-sensitive information directly
on top of the video feed in a see-through manner.


Benjamin Bederson introduces the term Audio Augmented Reality by presenting
a system that demonstrates an augmentation of the auditory modality [5]. The
prototype uses an MD player which plays audio information based on the
tracked position of the user as part of a museum guide.


1996 


Jun Rekimoto presents 2D matrix markers^7 (square-shaped barcodes), one of
the first marker systems to allow camera tracking with six degrees of
freedom [57] (see Fig. 4(c)).


1997 


Ronald Azuma presents the first survey on Augmented Reality. 
In his publication, Azuma provides a widely acknowledged definition 
for AR [4], as identified by three characteristics: 

• it combines real and virtual 

• it is interactive in real time 

• it is registered in 3D. 



^7 http://www.sonycsl.co.jp/person/rekimoto/matrix/Matrix.html
Steve Feiner et al. present the Touring Machine, the first mobile augmented
reality system (MARS) [15] (see Fig. 4(d) and Fig. 4(e)). It uses a
see-through head-worn display with integral orientation tracker; a backpack
holding a computer, differential GPS, and digital radio for wireless web
access; and a hand-held computer with stylus and touchpad interface^8.

Thad Starner et al. explore possible applications of mobile augmented
reality, creating a small community of users equipped with wearable
computers interconnected over a network [66]. The explored applications
include an information system for offices, people recognition and coarse
localization with infrared beacons.

Philippe Kahn invents the camera phone^9, a mobile phone which is able to
capture still photographs (see Fig. 5(a)). Back in 1997, Kahn used his
invention to share a picture of his newborn daughter with more than 2000
relatives and friends, spread around the world. Today more than half of all
mobile phones in use are camera phones.

Sony releases the Glasstron, a series of optical HMDs (optionally
see-through) for the general public. Adoption was rather limited, but the
affordable price of the HMD made it very popular in AR research labs and for
the development of wearable AR prototypes (see Fig. 5(b)).

1998 

Bruce Thomas et al. present "Map-in-the-hat", a backpack-based wearable
computer that includes GPS, an electronic compass and a head-mounted display
[71] (see Fig. 5(c)). At this stage the system was utilized for navigation
guidance, but it later evolved into Tinmith, an AR platform used for several
other AR projects^10.

1999 





Hirokazu Kato and Mark Billinghurst present ARToolKit, a pose tracking
library with six degrees of freedom, using square fiducials and a
template-based approach for recognition [31]. ARToolKit is available as open
source under the GPL license and is still very popular in the AR community
(see Fig. 5(d)).

^8 MARS: http://graphics.cs.columbia.edu/projects/mars/mars.html
^9 Wikipedia Camera Phone: http://en.wikipedia.org/wiki/Camera_phone
^10 Tinmith webpage: http://www.tinmith.net/
Figure 5: (a): Camera Phone Development by Kahn. (b): Sony Glasstron
optical HMD in 1997. (c): Thomas et al.'s Tinmith system [71]. (d):
ARToolKit for pose tracking in 6DOF [31]. (e): Palm VII, the first consumer
LBS device.


Tobias Höllerer et al. develop a mobile AR system that allows the user to
explore hypermedia news stories that are located at the places to which they
refer and to receive a guided campus tour that overlays models of earlier
buildings [26] (see Fig. 6(a)). This was the first mobile AR system to use
RTK GPS and an inertial-magnetic orientation tracker.



Tobias Höllerer et al. present a mobile augmented reality system that
includes indoor user interfaces (desktop, AR tabletop, and head-worn VR) to
interact with the outdoor user [27] (see Fig. 6(b)). While outdoor users
experience a first-person spatialized multimedia presentation via a
head-mounted display, indoor users can get an overview of the outdoor scene.



Jim Spohrer publishes the Worldboard concept, a scalable infrastructure to
support mobile applications that span from low-end location-based services
up to high-end mobile AR [65]. In his paper, Spohrer also envisions possible
application cases for mobile AR, and social implications.
Figure 6: (a): Höllerer et al.'s MARS system [26]. (b): Höllerer et al.'s
user interface [27]. (c): Benefon Esc! NT2002, the first GSM phone with a
built-in GPS sensor.


The first consumer device for location-based services (LBS) was the Palm
VII, which only supported zip-code-based location services (see Fig. 5(e)).
Two years later, different mobile operators provided different
location-based services using private network technology^11.


The Benefon Esc! NT2002^12, the first GSM phone with a built-in GPS
receiver, is released in late 1999 (see Fig. 6(c)). It had a black and white
screen with a resolution of 100x160 pixels. Due to limited storage, the
phone downloaded maps on demand. The phone also included a friend finder
that exchanged GPS positions with other Esc! devices via SMS.


The wireless network protocols 802.11a/802.11b^13, commonly known as WiFi,
are defined. The original version (now obsolete) specifies bit rates of 1 or
2 megabits per second (Mbit/s), plus forward error correction code.


2000 

Bruce Thomas et al. present ARQuake, an extension to the popular desktop
game Quake [70] (see Fig. 7(a)). ARQuake is a first-person perspective
application which is based on a 6DOF tracking system using GPS, a digital
compass and vision-based tracking of fiducial markers. Users are equipped
with a wearable computer system in a backpack, an HMD and a simple
two-button input device. The game can be played indoors or outdoors; the
usual keyboard and mouse commands for movement and actions are replaced by
movements of the user in the real environment and by the simple input
interface.


^11 Wikipedia: http://en.wikipedia.org/wiki/Palm_VII
^12 http://www.benefon.de/products/esc/
^13 Wikipedia: http://en.wikipedia.org/wiki/802.11
Figure 7: (a): ARQuake by Thomas et al. [70]. (b): mPARD system by
Regenbrecht and Specht [54]. (c): BARS system by Julier et al. [29]. (d):
First commercial camera phone in 2000.


Regenbrecht and Specht present mPARD, which uses analogue wireless video
transmission to a host computer that takes the burden of computation off the
mobile hardware platform [54] (see Fig. 7(b)). The rendered and augmented
images are sent back to the visualization device over a separate analogue
channel. The system can operate within 300 m outdoors and 30 m indoors, and
the batteries allow for up to 5 hours of uninterrupted operation.



Fritsch et al. introduce a general architecture for large-scale AR systems
as part of the NEXUS project. The NEXUS model introduces the notion of an
augmented world using distributed data management and a variety of sensor
systems [17].


Simon Julier et al. present BARS, the Battlefield Augmented Reality System
[29] (see Fig. 7(c)). The system consists of a wearable computer, a wireless
network system and a see-through HMD. The system targets the augmentation of
a battlefield scene with additional information about environmental
infrastructure, but also about possible enemy ambushes.
Sharp Corporation releases the first commercial camera phone to the public
(see Fig. 7(d)). The official name of the phone is J-SH04^14. The phone's
camera has a resolution of 0.1 megapixels.

At ISAR, Julier et al. describe the problem of information overload and
visual clutter within mobile Augmented Reality [28]. They propose
information filtering for mobile AR based on techniques such as
physically-based methods, methods using the spatial model of interaction,
rule-based filtering, and a combination of these methods to reduce the
information overload in mobile AR scenarios.


2001 


Joseph Newman et al. present the BatPortal [49], a PDA-based, wireless AR
system (see Fig. 8(a)). Localization is performed by measuring the travel
time of ultrasonic pulses between specially built devices worn by the user,
so-called Bats, and fixed receivers deployed in the ceilings building-wide.
The system can support an HMD-based setup, but also the better-known
BatPortal using a handheld device. Based on a fixed configuration of the PDA
carried and the personal Bat worn, the direction of the user's view is
estimated, and a model of the scene with additional information about the
scene is rendered onto the PDA screen.


Kara et al. introduce TOWNWEAR, an outdoor system that uses a fiber optic
gyroscope for orientation tracking [62] (see Fig. 8(b)). The high precision
gyroscope is used to measure the 3DOF head direction accurately with minimal
drift, which is then compensated by tracking natural features.

Jurgen Fruend et al. present AR-PDA, a concept for building a wireless AR
system and a special prototype of palm-sized hardware [18] (see Fig. 8(c)).
Basic design ideas include the augmentation of real camera images with
additional virtual objects, for example for illustration of functionality
and interaction with commonly used household equipment.

Reitmayr and Schmalstieg present a mobile, multi-user AR 
system [55] (see Fig. 8(d)). The ideas of mobile augmented 
reality and collaboration between users in augmented shared 
space are combined and merged into a hybrid system. Communication is 

^14 http://k-tai.impress.co.jp/cda/article/showcase_top/3913.html
Figure 8: (a): BatPortal by Newman et al. [49]. (b): TOWNWEAR system by Kara
et al. [62]. (c): Wireless AR setup concept by Fruend et al. [18]. (d):
Multi-user AR system by Reitmayr and Schmalstieg [55]. (e): ARCHEOGUIDE by
Vlahakis et al. [73]. (f): Mobile AR restaurant guide by Bell et al. [6].
(g): First AR browser by Kooper and MacIntyre [35].
performed using LAN and wireless LAN, where mobile users and stationary
users are acting in a common augmented space. 

Vlahakis et al. present Archeoguide, a mobile AR system for cultural
heritage sites [73] (see Fig. 8(e)). The system is built around the
historical site of Olympia, Greece. The system contains a navigation
interface, 3D models of ancient temples and statues, and avatars which are
competing for the win in the historical run in the ancient Stadium. While
communication is based on WLAN, accurate localization is performed using
GPS. Within the system a scalable setup of mobile units can be used,
starting with a notebook-sized system with HMD, down to palmtop computers
and Pocket PCs.

Kretschmer et al. present the GEIST system, a system for interactive
story-telling within urban and/or historical environments [36]. A complex
database setup provides information queues for the appearance of buildings
in ancient times or historical facts and events. Complex queries can be
formulated and stories can be told by fictional avatars or historical
persons.

Columbia’s Computer Graphics and User Interfaces Lab does 
an outdoor demonstration of their mobile AR restaurant guide 
at ISAR 2001, running on their Touring Machine [6] (see 
Fig. 8(f)). Pop-up information sheets for nearby restaurants are overlaid on
the user’s view, and linked to reviews, menus, photos, and restaurant URLs. 

Kooper and MacIntyre create the RWWW Browser, a mobile 
AR application that acts as an interface to the World Wide Web 
[35] (see Fig. 8(g)). It is the first AR browser. This early 
system suffers from the cumbersome AR hardware of that time, requiring 
a head mounted display and complicated tracking infrastructure. In 2008 
Wikitude implements a similar idea on a mobile phone. 

2002 

Michael Kalkusch et al. present a mobile augmented reality system to guide a
user through an unfamiliar building to a destination room [30] (see
Fig. 9(a)). The system presents a world-registered wireframe model of the
building labeled with directional information in a see-through heads-up
display, and a three-dimensional world-in-miniature (WIM) map on a
wrist-worn pad that also acts as an input device. Tracking is done using a
combination of wall-mounted ARToolKit
Figure 9: (a): Navigation system by Kalkusch et al. [30]. (b): ARPad by
Mogilev et al. [43]. (c): Human Pacman by Cheok et al. [9]. (d): iLamps
system by Raskar et al. [53]. (e): Indoor AR guidance system by Wagner and
Schmalstieg [77]. (f): Siemens SX1 AR game "Mozzies". (g): Mobile Authoring
system by Guven and Feiner [22].
markers observed by a head-mounted camera, and an inertial tracker.


Leonid Naimark and Eric Foxlin present a wearable low-power hybrid visual
and inertial tracker [46]. This tracker, later known as InterSense's
IS-1200, can be used for tracking at large scale, such as a complete
building. This is achieved by tracking a newly designed 2D barcode with
thousands of different codes and combining the result with an inertial
sensor.

Mogilev et al. introduce the AR Pad, an ad-hoc mobile AR device equipped
with a spaceball controller [43] (see Fig. 9(b)).


2003 

Adrian David Cheok et al. present the Human Pacman [9] (see Fig. 9(c)).
Human Pacman is an interactive ubiquitous and mobile entertainment system
that is built upon position and perspective sensing via the Global
Positioning System and inertial sensors, and tangible human-computer
interfacing with the use of Bluetooth and capacitive sensors. Pacmen and
Ghosts are now real human players in the real world, experiencing mixed
computer graphics fantasy-reality provided by wearable computers that are
equipped with GPS and inertial sensors for tracking the players' position
and perspective. Virtual cookies and actual tangible physical objects with
Bluetooth devices and capacitive sensors are incorporated into the game play
to provide novel experiences of seamless transitions between real and
virtual worlds.

Ramesh Raskar et al. present iLamps [53] (see Fig. 9(d)). This 
work created a first prototype for object augmentation with a 
hand-held projector-camera system. An enhanced projector 
can determine and respond to the geometry of the display surface, and can 
be used in an ad-hoc cluster to create a self-configuring display. Furthermore 
interaction techniques and co-operation between multiple units are discussed. 

Daniel Wagner and Dieter Schmalstieg present an indoor AR 
guidance system running autonomously on a PDA [77] (see 
Fig. 9(e)). They exploit the wide availability of consumer 
devices with a minimal need for infrastructure. The application provides the 
user with a three-dimensional augmented view of the environment by using 
a Windows Mobile port of ARToolKit for tracking and runs directly on the 
PDA. 
Figure 10: (a): Tracking 3D markers by Möhring et al. [44]. (b): Visual
Codes by Rohs and Gfeller [59]. (c): OSGAR system by Coelho et al. [10].
(d): The Invisible Train [75].



The Siemens SX1 is released, coming with the first commercial mobile phone
AR camera game called Mozzies (also known as Mosquito Hunt) (see Fig. 9(f)).
The mosquitoes are superimposed on the live video feed from the camera.
Aiming is done by moving the phone around so that the cross hair points at
the mosquitoes. Mozzies was awarded the title of best mobile game in 2003.


Sinem Guven presents a mobile AR authoring system for creating and editing
3D hypermedia narratives that are interwoven with a wearable computer user's
surrounding environment^15 [22] (see Fig. 9(g)). The system was designed for
authors who are not programmers and used a combination of 3D drag-and-drop
for positioning media and a timeline for synchronization. It allowed authors
to preview their results on a desktop workstation, as well as with a
wearable AR or VR system.

^15 http://graphics.cs.columbia.edu/projects/mars/Authoring.html
2004



Mathias Möhring et al. present a system for tracking 3D markers on a mobile
phone [44] (see Fig. 10(a)). This work showed a first video see-through
augmented reality system on a consumer cell-phone. It supports the detection
and differentiation of different 3D markers, and correct integration of
rendered 3D graphics into the live video stream.


Michael Rohs and Beat Gfeller present Visual Codes, a 2D marker system for
mobile phones [59] (see Fig. 10(b)). These codes can be attached to physical
objects in order to retrieve object-related information and functionality.
They are also suitable for display on electronic screens.



Enylton Machado Coelho et al. present OSGAR, a scene graph with uncertain
transformations [10] (see Fig. 10(c)). In their work they target the problem
of registration error, which is especially important for mobile scenarios
when high quality tracking is not available and overlay graphics will not
align perfectly with the real environment. OSGAR dynamically adapts the
display to mitigate the effects of registration errors.



The Invisible Train is shown at SIGGRAPH 2004 Emerging Technologies^16 (see
Fig. 10(d)). The Invisible Train is the first multi-user Augmented Reality
application for handheld devices [75].


2005 


Anders Henrysson ports ARToolKit to Symbian [23] (see Fig. 11(a)). Based on
this technology he presents the famous AR-Tennis game, the first
collaborative AR application running on a mobile phone. AR-Tennis was
awarded the Independent Mobile Gaming best game award for 2005, and the
technical achievement award.



Project ULTRA shows how to use non-realtime natural feature tracking on PDAs
to support people in multiple domains such as the maintenance and support of
complex machines, construction and production, and edutainment and cultural
heritage [40]. Furthermore, an authoring environment is developed to create
the AR scenes for the maintenance tasks.


^16 The Invisible Train: http://studierstube.icg.tugraz.at/invisible_train/
Figure 11: (a): AR-Tennis by Henrysson et al. [23]. (b): Going Out by
Reitmayr and Drummond [56]. (c): Mara system by Nokia in 2006.


The first mobile phones equipped with three-axis accelerometers were the
Sharp V603SH and the Samsung SCH-S310, both sold in Asia in 2005.


2006 

Reitmayr and Drummond present a model-based hybrid tracking system for
outdoor augmented reality in urban environments enabling accurate, real-time
overlays on a handheld device [56] (see Fig. 11(b)). The system combines an
edge-based tracker for accurate localization, gyroscope measurements to deal
with fast motions, measurements of gravity and magnetic field to avoid
drift, and a back store of reference frames with online frame selection to
re-initialize automatically after dynamic occlusions or failures.

Nokia presents Mara, a multi-sensor AR guidance application for mobile
phones^17. The prototype application overlays the continuous viewfinder
image stream captured by the camera with graphics and text in real time,
annotating the user's surroundings (see Fig. 11(c)).



^17 Mara: http://research.nokia.com/page/219
Figure 12: (a): PTAM by Klein and Murray [33]. (b): GroundCam by DiVerdi and
Höllerer [12]. (c): Map Navigation with mobile devices by Rohs et al. [60].
(d): Apple iPhone 2G. (e): AR advertising app by HIT Lab NZ and Saatchi.


2007 

Klein and Murray present a system capable of robust real-time tracking and
mapping in parallel with a monocular camera in small workspaces [33] (see
Fig. 12(a)). It is an adaptation of a SLAM approach which processes the
tracking and mapping tasks on two separate threads.

DiVerdi and Höllerer present the GroundCam, a system combining a camera and
an orientation tracker [12] (see Fig. 12(b)). The camera points at the
ground behind the user and provides 2D tracking information. The method is
similar to that of an optical desktop mouse.





Rohs et al. compare the performance of the following navigation methods for
map navigation on mobile devices: joystick navigation, the dynamic peephole
method without visual context, and the magic lens paradigm using external
visual context [60] (see Fig. 12(c)). In their user study they demonstrate
the advantage of dynamic peephole and magic lens interaction over joystick
interaction in terms of search time and degree of exploration of the search
space.



















Figure 13: (a): Real-time natural feature tracking on mobile phones by
Wagner et al. [76]. (b): Commercial AR museum guide by METAIO [42].
(c): Wikitude AR Browser.


The first multi-touch screen mobile phone, the iPhone sold by Apple,
introduces a new way to interact with mobile devices (see Fig. 12(d)).

HIT Lab NZ and Saatchi deliver the world’s first mobile phone 
based AR advertising application for the Wellington Zoo [79] (see 
Fig. 12(e)). 

2008 

Wagner et al. present the first 6DOF implementation of natural feature
tracking in real time on mobile phones, achieving interactive frame rates of
up to 20 Hz [76] (see Fig. 13(a)). They heavily modify the well-known SIFT
and Ferns methods in order to gain more speed and reduce memory
requirements.

METAIO presents a commercial mobile AR museum guide using natural feature
tracking for a six-month exhibition on Islamic art [42] (see Fig. 13(b)). In
their paper they describe the experience gained in this project.

With Augmented Reality 2.0, Schmalstieg et al. presented at the Dagstuhl
seminar in 2008 for the first time a concept that combined ideas of the Web
2.0 such as social media, crowdsourcing through public participation, and an
open architecture for content markup and distribution, and applied it to
mobile Augmented Reality to create a scalable AR experience [63].

Mobilizy launches Wikitude, an application that combines
GPS and compass data with Wikipedia entries. The Wikitude
World Browser overlays information on the real-time camera
view of an Android smartphone (see Fig. 13(c)).
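
The GPS and compass registration described above can be illustrated with a
minimal sketch. It is not Wikitude's implementation; the function names, the
60 degree field of view, the 480 pixel screen width and the linear
angle-to-pixel mapping are all assumptions made for illustration.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(user_lat, user_lon, heading_deg, poi_lat, poi_lon,
             fov_deg=60.0, width_px=480):
    """Horizontal pixel position of a point-of-interest label, or None
    if the point lies outside the (assumed) horizontal field of view."""
    poi_bearing = bearing_deg(user_lat, user_lon, poi_lat, poi_lon)
    # Signed angle between the compass heading and the POI, in [-180, 180).
    offset = (poi_bearing - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2.0:
        return None
    # Linear angle-to-pixel mapping, a simplification of perspective projection.
    return (offset / fov_deg + 0.5) * width_px
```

A point of interest due east of a user who faces east lands in the centre of
the screen, while the same point is not drawn when the user faces west.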

2009 

Morrison et al. present MapLens, which is a mobile augmented
reality (AR) map using a magic lens over a paper map [45]
(see Fig. 14(a)). They conduct a broad user study in the form
of an outdoor location-based game. Their main finding is that AR features
facilitate place-making by creating a constant need for referencing to the
physical. The field trials show that the main potential of AR maps lies in
their use as a collaborative tool.

Hagbi et al. present an approach that tracks the pose of
the mobile device by pointing it at fiducials [7] (see Fig. 14(b)).
Unlike existing systems, the approach can track a wide set
of planar shapes, and the user can teach the system new shapes at runtime
by showing them to the camera. The learned shapes are then maintained
by the system in a shape library, enabling new AR application scenarios in
terms of interaction with the scene but also in terms of fiducial design.

Sean White introduces SiteLens (see Fig. 14(c)), a hand-held
mobile AR system for urban design and urban planning
site visits [78]. SiteLens creates "situated visualizations" that
are related to and displayed in their environment. For example,
representations of geocoded carbon monoxide concentration data are overlaid
at the sites at which the data was recorded.

SPRXmobile launches Layar, an advanced variant of Wikitude
(see Fig. 14(d)). Layar uses the same registration mechanism
as Wikitude (GPS + compass), and incorporates this into
an open client-server platform. Content layers are the equivalent of web
pages in normal browsers. Existing layers range from Wikipedia, Twitter and
Brightkite to local services like Yelp, Trulia, store locators, nearby bus
stops, mobile coupons, Mazda dealers and tourist, nature and cultural guides.
On August 17th Layar went global, serving almost 100 content layers.


Wikitude: http://www.mobilizy.com/wikitude.php?lang=en
LayAR: http://layar.eu/

Figure 14: (a): MapLens by Morrison et al. [45]. (b): Hagbi’s pose tracking
using shape [7]. (c): SiteLens by White and Feiner [78]. (d): LayAR AR
browser. (e): ARhrrrr! Zombie game by Spreen et al. from Georgia Tech.
(f): Klein’s PTAM system running on an iPhone [34].

Kimberly Spreen et al. develop ARhrrrr!, the first mobile AR
game with high quality content at the level of commercial
games [80] (see Fig. 14(e)). They use an NVIDIA Tegra developer
kit ("Concorde") with a fast GPU. All processing except for tracking
runs on the GPU, making the whole application run at high frame
rates on a mobile phone class device despite the highly detailed content and
natural feature tracking.

Georg Klein presents a video showing his SLAM system running
in real-time on an iPhone [34] (see Fig. 14(f)) and
later presents this at ISMAR 2009 in Orlando, Florida. Even
though it has constraints in terms of working area, it is the first time a
6DoF SLAM system is known to run on mobile phones at sufficient speed.



Update April 2015: The following parts of the document until beginning 
of 2015 cover the years since the last homepage update, following the same 
categorization and scheme as before. 

From the end of 2009 onwards, AR research and development is generally
driven by high expectations and huge investments from world-leading companies
such as Microsoft, Google, Facebook, Qualcomm and others. At the same
time, the landscape of mobile phone manufacturers started to change
radically.

In general the advances in mobile device capabilities introduce a strong drive
towards mobile computing, and the availability of cloud processing further
supports the proposal and development of server-client solutions for AR
purposes. One major trend starting around 2010, originating from the work of
Davison in 2003 [11] and later further explored by Klein and Murray [33, 34],
is the heavy use of SLAM in AR, which still continues to dominate a major
part of AR research and development as of the beginning of 2015.


Microsoft presents "Project Natal" at the game exhibition E3. It
is the first version of a new hardware interface, consisting of
motion detection technology, microphone, color camera and software,
to be integrated into the game console Xbox 360.

At ISMAR 2009, Clemens Arth et al. present a system for
large-scale localization and subsequent 6DOF tracking
on mobile phones [3]. The system uses sparse point clouds of
city areas and FAST corners and SURF-like descriptors that can be used on
memory-limited devices (see Fig. 15(a)).

Qualcomm Inc. acquires the mobile AR IP from Imagination
Computer Services GmbH., Vienna, and takes over the funding of
the Christian Doppler Laboratory for Handheld AR at Graz University
of Technology. A research center to focus on AR is opened later in 2010
in Vienna [81].


2010 



A real-time panoramic mapping and tracking system
for mobile phones is presented by Wagner et al. at VR, which
performs 3DOF tracking in cylindrical space and supports the use
of panoramic imagery for improved usability and experience in AR [74] (see
Fig. 15(b)).
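
As a rough illustration of what mapping into cylindrical space means, the
following sketch projects a viewing ray onto an unrolled unit cylinder. It is
not the method of Wagner et al. [74]; the coordinate convention (y axis as
cylinder axis, z axis as the initial viewing direction), the map size and the
function name are assumptions chosen for this example.

```python
import math

def ray_to_cylinder(x, y, z, map_width=2048, map_height=512, v_range=1.0):
    """Project a viewing ray (x, y, z) onto an unrolled unit cylinder whose
    axis is the y axis. Returns map coordinates (u, v), or None if the ray
    misses the part of the cylinder covered by the map."""
    radial = math.hypot(x, z)
    if radial == 0.0:
        return None                      # ray parallel to the cylinder axis
    theta = math.atan2(x, z)             # azimuth around the axis, (-pi, pi]
    height = y / radial                  # height at which the ray hits the cylinder
    if abs(height) > v_range:
        return None                      # above or below the mapped band
    u = ((theta / (2.0 * math.pi) + 0.5) * map_width) % map_width
    v = (height / (2.0 * v_range) + 0.5) * map_height
    return u, v
```

With a rotation-only camera model, every pixel ray of a new frame can be
rotated by the current orientation estimate and written into the map this way,
which is why three degrees of freedom are enough for this kind of tracking.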


KHARMA is a lightweight and open architecture for referencing
and delivering content, explicitly aiming for mobile AR applications
running on a global scale. It uses KML for describing the
geospatial or relative relation of content while relying on HTML, JavaScript
and CSS technologies for content development and delivery [25].



Microsoft announces a close cooperation with PrimeSense [82],
an Israeli company working on structured-light based 3D sensors,
to supply their technology to "Project Natal", now named Kinect. The
Kinect becomes commercially available in November 2010.


Apple releases the iPad in April 2010, which becomes the first
tablet computer to be adopted by the general public. The iPad
features assisted GPS, accelerometers, magnetometers and an advanced
graphics chipset (PowerVR SGX535), making it possible to create efficient AR
applications on a tablet computer (see Fig. 15(c)).



At ISMAR Lukas Gruber et al. present the "City of Sights", a
collection of datasets and paperboard models to evaluate the tracking
and reconstruction performance of algorithms used in AR [20] (see
Fig. 15(d)).

After several delays, Microsoft releases Windows Phone in October
2010, to become the third major mobile phone operating system
to challenge iOS and Android.


Wikipedia: http://en.wikipedia.org/wiki/IPad

City of Sights: http://studierstube.icg.tugraz.at/handheld_ar/cityofsights.php








Figure 15: (a): City reconstruction as used by Arth et al. [3]. (b): Panoramic
image captured on a mobile phone using the approach of Wagner et al. [74].
(c): Apple iPad. (d): City-of-Sights paperboard models by Gruber et al. [20].
(e): In-situ information creation by Langlotz et al. [37].

Figure 16: (a): KinectFusion system presented by Newcombe et al. at
ISMAR 2011 [47]. (b): Mobile phone scene reconstruction by Pan et al. [50].



Existing mobile AR applications were exclusively used to
browse and consume digital information. Langlotz et al. present
a new approach aiming for AR browsers that also support
the creation of digital information in-situ. The information is registered
with pixel precision by utilizing a panorama of the environment that is
created in the background [37] (see Fig. 15(e)).


2011 



Qualcomm announces the release of its AR platform SDK
to the public in April. At that time it is called QCAR [83],
which will later be called Vuforia.


In August, Google announces the acquisition of Motorola
Mobility for about $12.5 billion [84]. A major asset of Motorola is
a large patent portfolio, which Google needs to secure the further
development of the Android platform.



At ICCV 2011, Newcombe presents DTAM, a dense real-time
tracking and mapping algorithm [48]. Later at ISMAR
2011, Richard Newcombe presents the KinectFusion work
[47], in which depth images from the Kinect sensor are fused to create a
single implicit surface model. KinectFusion later becomes publicly available
within the Kinect SDK [85] (see Fig. 16(a)).



Qi Pan presents his work on reconstructing scenes on mobile
phones using panoramic images. By using FAST corners
and a SURF-like descriptor, multiple panoramas are registered
and a triangulated model is created after voxel carving [50] (see Fig. 16(b)).



Nokia N900 


Addressing the still challenging problem of running SLAM in
real-time on mobiles, Pirchheim presents an approach using
planarity assumptions, and demonstrates his approach on a
smartphone [51].



Grubert et al. publish a technical report about the plausibility 
of using AR browsers [21], which becomes a survey about the 
pros and cons of AR browser technology at that point in time. 


2012 


Smart watches are broadly introduced as a new generation of mobile
wearables. Pebble and the Sony SmartWatch are built to connect
to a personal smartphone and to provide simple functionality, such as
notifications or call answering.

Google Glass (also known as Google Project Glass) is first presented
to the public (see Fig. 17(b)). Google Glass is an optical
HMD that can be controlled with an integrated touch-sensitive sensor or
natural language commands. After its public announcement Google Glass had
a major impact on research but even more on the public perception of mixed
reality technology.

NVidia demonstrates at Siggraph Emerging Technologies their
prototype of a head mounted display supporting accurate accommodation,
convergence, and binocular-disparity depth cues (see Fig. 17(c)).
The prototype introduces a light-field-based approach to near-eye displays
and can be seen as a next generation wearable display technology for AR, as
existing hardware cannot provide accurate accommodation [86].

13th Lab releases the first commercial mobile SLAM (Simultaneous
Localization and Mapping) system, named Pointcloud, to the public,
marking a major milestone for app developers who want to integrate
SLAM-based tracking into their applications.



PrimeSense, the creator of the sensing technology in the Microsoft Kinect,
introduces a smaller version of a 3D sensing device called Capri [87] that
is small enough to be integrated into mobile devices such as tablets or
smartphones.


Google Glass project page on Google+: https://plus.google.com/+GoogleGlass
Pointcloud homepage: http://pointcloud.io/
Pointcloud video: http://www.youtube.com/watch?v=K50KaK3Ay8U


Figure 17: (a): Oculus Rift developer edition. (b): Google Glass. (c):
Near-eye light field project by NVidia.


At ISMAR 2012, Steffen Gauglitz et al. present their approach
to tracking and mapping from both general and
rotation-only camera motion [19].

In August, Oculus VR announces the Oculus Rift dev kit, a virtual
reality head-mounted display. This initiates a new hype around Virtual
Reality and the development of more head-mounted displays, mainly for
gaming purposes (see Fig. 17(a)).



2013 



As opposed to previous work from Gauglitz et al., Pirchheim 
et al. present an approach to handle pure camera rotation 
running on a mobile phone at ISMAR [52]. 


Google Glass, which was already announced as Project Glass in
2012, becomes available through the explorer program in late 2013
and raises positive and negative attention, as well as concerns about privacy
and ethical aspects (see Fig. 17(b)).



At ICRA, Li et al. present an amazing approach for motion
tracking with inertial sensors and a rolling-shutter
camera running in real-time on a mobile phone [38].



Tan et al. propose an approach to SLAM working in dynamic
environments, allowing parts in the scene to be dynamic without
breaking the mapping and tracking [68].


Capri Video: http://www.youtube.com/watch?v=ELTETX002zE

Figure 18: (a): SLAM map localization by Ventura et al. [72]. (b):
LSD-SLAM reconstruction by Engel et al. [13].


On November 24, 2013, Apple Inc. confirms the purchase of
PrimeSense for about $350 million [88]. PrimeSense was working
on shrinking their sensors to fit into mobiles at that point in time.



Tanskanen et al. propose an approach to perform full 3D
reconstruction on a monocular smartphone, creating a dense
3D model with known absolute scale [69].


2014 



Three years after the acquisition, in January Google sells Motorola
Mobility to Lenovo for $2.91 billion, while keeping most
of the patent portfolio [89].

Also in January, Qualcomm acquires Kooaba [90], a Swiss ETH
spin-off founded in 2007, built around image recognition using SURF
features. Kooaba’s technology is integrated into the services
provided through the Vuforia SDK.

In February, Google announces Project Tango [91], an
Android smartphone equipped with a full Kinect-like 3D sensor, and
hands out a few hundred units to developers and companies.

In March, Facebook acquires Oculus VR for $2 billion, although
Oculus does not yet make any consumer products at that point in
time [92]. This strengthens the hype around upcoming VR interfaces.

At VR, Ventura et al. present an approach to localize SLAM
maps built on a mobile phone accurately with respect to a sparse 3D
reconstruction of urban environments [72] (see Fig. 18(a)).

In April, Microsoft completes the acquisition of Nokia’s Devices
and Services unit for $7.2 billion [93], as Nokia is the primary
vendor for Windows devices, especially the Lumia phones.

Following up on previous work at ICCV 2013 [14], at
ECCV Engel et al. present LSD-SLAM, a feature-less
monocular SLAM algorithm using keyframes and semi-dense
depth maps, and release the code to the public [13] (see Fig. 18(b)).
At ISMAR, a mobile version is presented as well [64].




At 3DV, Herrera et al. present DT-SLAM [24]. The key idea
behind the approach is to defer the triangulation of 2D
features matched across keyframes until they have been observed
with a sufficient baseline, improving the overall robustness of SLAM.
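
The idea of deferring triangulation can be illustrated with a minimal sketch.
It does not reproduce the criterion used in DT-SLAM [24]; as a stand-in it
uses a common parallax test, namely the angle between the two viewing rays of
a matched feature expressed in a shared world frame. The function names and
the 2 degree threshold are assumptions made for illustration.

```python
import math

def ray_angle_deg(ray_a, ray_b):
    """Angle in degrees between two 3D viewing rays given in a common frame."""
    dot = sum(a * b for a, b in zip(ray_a, ray_b))
    norm_a = math.sqrt(sum(a * a for a in ray_a))
    norm_b = math.sqrt(sum(b * b for b in ray_b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))

def split_matches(matches, min_angle_deg=2.0):
    """Split feature matches into those ready for triangulation and those
    to keep as 2D-only features for now.

    matches: iterable of (feature_id, ray_in_keyframe_a, ray_in_keyframe_b),
    with both rays already rotated into the world frame."""
    ready, deferred = [], []
    for feature_id, ray_a, ray_b in matches:
        if ray_angle_deg(ray_a, ray_b) >= min_angle_deg:
            ready.append(feature_id)      # enough parallax: depth is well constrained
        else:
            deferred.append(feature_id)   # nearly parallel rays: wait for more baseline
    return ready, deferred
```

Features whose rays are almost parallel would give poorly conditioned depth
estimates, so in this sketch they stay in the deferred list until a later
keyframe provides more parallax.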




At ISMAR, Salas-Moreno et al. present Dense Planar 
SLAM, leveraging the assumption that many man-made 
surfaces are planar [61]. 


2015


In January, Microsoft announces the HoloLens, a headset to fuse AR
and VR [94], to be made available later in 2015. The device is a
complete computer with a see-through display and several sensors.

In May, DAQRI, a company working on AR helmets, acquires
ARToolworks [95]. Oculus VR announces the acquisition of
Surreal Vision, bringing the company’s expertise in recreating real-time
3D representations of the outside world into virtualized environments
[96]. A few days later, the acquisition of the German AR company Metaio
by Apple is announced [97]. Metaio was a major player in the distribution of
AR technology to developers through its SDK, which is now abruptly
discontinued.

In June, Magic Leap announces that it will soon release its
AR platform SDK to the public. It is expected to
support Unity and the Unreal engine [98].

HTC and Valve start shipping their developer hardware in very limited
quantities [99], including a Vive headset and two Lighthouse
base stations as passive components together with the two wireless
Steam VR controllers.

In October 2015, Qualcomm sells its business unit in Vienna responsible
for the development of the Augmented Reality SDK Vuforia to
PTC [100] for USD 65 million.


At ISMAR 2015, the group of Graz University of Technology
wins the best paper award for their work "Instant Outdoor
Localization and SLAM Initialization from 2.5D Maps" [2].
The algorithm uses OpenStreetMap data and calculates the pose for an image
based on building edges and semantic image information.